@LukeWood commented Aug 9, 2020

I've recently been trying to take on RL again (a bit of redemption after my
glaring defeat by Box2D car racing two years ago!). Along the way I
came across some new analogies that were extremely useful to me in
building a mental model of how some of these techniques work. I thought
they might be useful to some of your students.

This commit

  • adds an analogy for the purpose of the target network
  • emphasizes the reason experience replay works
  • adds a section on advanced RL techniques used to overcome sparse
    reward functions

@LukeWood requested a review from eclarson on August 9, 2020, 17:54